planner: pre-refactor for join reorder conflict detection algorithm#68870
planner: pre-refactor for join reorder conflict detection algorithm#68870guo-shaoge wants to merge 9 commits into
Conversation
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
…nto cp_pre_cdc_impl
…nto cp_pre_cdc_impl
|
This cherry pick PR is for a release branch and has not yet been approved by triage owners. To merge this cherry pick:
DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds nested ChangesLEADING Hint Parsing and Propagation
Join-order Utility Library and Rule Refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/util/hint/hint.go (1)
955-983:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClear
LeadingListwhen LEADING is invalidated.This branch invalidates conflicting
LEADINGhints by emptyingLeadingJoinOrder, but it still returns the parsedLeadingList. The new join-order helpers readPlanHints.LeadingListdirectly, soLEADING + STRAIGHT_JOINor multipleLEADINGhints can still influence join-order selection after being marked invalid here.Suggested fix
if leadingHintCnt > 1 || (leadingHintCnt > 0 && straightJoinOrder) { // If there are more leading hints or the straight_join hint existes, all leading hints will be invalid. leadingJoinOrder = leadingJoinOrder[:0] + leadingList = nil if leadingHintCnt > 1 { warnHandler.SetHintWarning("We can only use one leading hint at most, when multiple leading hints are used, all leading hints will be invalid") } else if straightJoinOrder { warnHandler.SetHintWarning("We can only use the straight_join hint, when we use the leading hint and straight_join hint at the same time, all leading hints will be invalid") } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/util/hint/hint.go` around lines 955 - 983, The code invalidates LEADING by clearing leadingJoinOrder but still returns LeadingList, so PlanHints.LeadingList remains populated and can affect join-order; update the branch that checks leadingHintCnt and straightJoinOrder (the block manipulating leadingJoinOrder and warnHandler.SetHintWarning) to also clear leadingList (or set PlanHints.LeadingList to empty) when LEADING is deemed invalidated so the returned PlanHints has LeadingList cleared alongside LeadingJoinOrder and the same warning behavior is preserved.
🧹 Nitpick comments (1)
pkg/expression/schema.go (1)
86-100: ⚡ Quick winClarify that
Equalonly compares ordered columns.The implementation is narrower than the name/comment suggest: it ignores
KeysandUniqueKeys. That is fine for the new join-reorder caller, but the exported contract should say so explicitly or use a more specific name to avoid future misuse.As per coding guidelines, "Keep exported-symbol doc comments, and prefer semantic constraints over name restatement".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/expression/schema.go` around lines 86 - 100, The Schema.Equal method currently only compares ordered Columns and ignores Keys and UniqueKeys, which is narrower than its name suggests; update the exported doc comment for Schema.Equal to explicitly state it compares only the ordered Columns (and not Keys or UniqueKeys) or rename the method to a more specific name (e.g., EqualColumnsOrdered) to avoid misuse; locate the method by the symbol Schema.Equal and modify its comment to describe the exact contract (ordered column-wise equality) and, if renaming, update all call sites to use the new name.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/parser/ast/misc.go`:
- Around line 3967-4005: The bug is that when the first LeadingList item is
itself a nested LeadingList, RestoreWithQB pushes the hint-level QB into the
child causing LEADING((`@qb` a, b), c) instead of LEADING(`@qb` (a, b), c); modify
LeadingList.RestoreWithQB so that in the case "case *LeadingList" if i == 0 and
currentQBName.L != "" and !qbOnTable you emit the "@<qb>"
(ctx.WriteKeyWord("@"); ctx.WriteName(currentQBName.String()); ctx.WritePlain("
")) before restoring the child and then call t.RestoreWithQB with an empty
model.CIStr for qbName (so the child does not receive the QB), afterwards clear
currentQBName; otherwise keep the existing behavior of passing currentQBName
into the child.
In `@pkg/parser/hintparser_test.go`:
- Around line 369-474: The LEADING hint test cases use ast.NewCIStr but NewCIStr
is defined in the model package; update the LEADING test block to replace all
ast.NewCIStr(...) occurrences with model.NewCIStr(...). Specifically modify the
HintName fields and all HintTable.TableName constructors in the
TableOptimizerHint / LeadingList test cases so they use model.NewCIStr, leaving
the surrounding structures (TableOptimizerHint, LeadingList, HintTable,
HintName, Tables) unchanged.
In `@pkg/planner/core/joinorder/util.go`:
- Around line 221-225: The fallback path in util.go currently computes dbMatch
as "astTbl.DBName.L == '' || astTbl.DBName.L == blockName.DBName.L", which drops
the "*" DB wildcard semantics; update the dbMatch logic used in the blockOffset
> 1 fallback so it treats "*" as a wildcard (e.g., accept when astTbl.DBName.L
== "*" or blockName.DBName.L == "*") in addition to the empty or exact-match
cases, keeping tableMatch unchanged; modify the dbMatch check near variables
blockOffset, queryBlockNames, blockName, astTbl, dbMatch so LEADING(*.alias)
still matches derived-table aliases in this fallback path.
---
Outside diff comments:
In `@pkg/util/hint/hint.go`:
- Around line 955-983: The code invalidates LEADING by clearing leadingJoinOrder
but still returns LeadingList, so PlanHints.LeadingList remains populated and
can affect join-order; update the branch that checks leadingHintCnt and
straightJoinOrder (the block manipulating leadingJoinOrder and
warnHandler.SetHintWarning) to also clear leadingList (or set
PlanHints.LeadingList to empty) when LEADING is deemed invalidated so the
returned PlanHints has LeadingList cleared alongside LeadingJoinOrder and the
same warning behavior is preserved.
---
Nitpick comments:
In `@pkg/expression/schema.go`:
- Around line 86-100: The Schema.Equal method currently only compares ordered
Columns and ignores Keys and UniqueKeys, which is narrower than its name
suggests; update the exported doc comment for Schema.Equal to explicitly state
it compares only the ordered Columns (and not Keys or UniqueKeys) or rename the
method to a more specific name (e.g., EqualColumnsOrdered) to avoid misuse;
locate the method by the symbol Schema.Equal and modify its comment to describe
the exact contract (ordered column-wise equality) and, if renaming, update all
call sites to use the new name.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 362404df-e24a-4cdb-ab10-6a67cd93d0ba
📒 Files selected for processing (14)
pkg/expression/schema.gopkg/parser/ast/misc.gopkg/parser/ast/misc_test.gopkg/parser/hintparser.gopkg/parser/hintparser.ypkg/parser/hintparser_test.gopkg/parser/parser_test.gopkg/planner/core/BUILD.bazelpkg/planner/core/joinorder/BUILD.bazelpkg/planner/core/joinorder/util.gopkg/planner/core/operator/logicalop/logical_projection.gopkg/planner/core/plan_cost_ver2.gopkg/planner/core/rule_join_reorder.gopkg/util/hint/hint.go
| func (lt *LeadingList) RestoreWithQB(ctx *format.RestoreCtx, qbName model.CIStr, needParen bool, isTop bool, qbOnTable bool) error { | ||
| if lt == nil || len(lt.Items) == 0 { | ||
| return nil | ||
| } | ||
| if needParen { | ||
| ctx.WritePlain("(") | ||
| } | ||
|
|
||
| currentQBName := qbName // hint level QBName | ||
|
|
||
| for i, item := range lt.Items { | ||
| if i > 0 { | ||
| ctx.WritePlain(", ") | ||
| } | ||
|
|
||
| switch t := item.(type) { | ||
| case *HintTable: | ||
| if i == 0 && currentQBName.L != "" && !qbOnTable { | ||
| ctx.WriteKeyWord("@") | ||
| ctx.WriteName(currentQBName.String()) | ||
| ctx.WritePlain(" ") | ||
| t.Restore(ctx) | ||
| currentQBName = model.CIStr{} | ||
| } else { | ||
| t.Restore(ctx) | ||
| } | ||
| case *LeadingList: | ||
| if err := t.RestoreWithQB(ctx, currentQBName, true, false, qbOnTable); err != nil { | ||
| return err | ||
| } | ||
| currentQBName = model.CIStr{} | ||
| default: | ||
| return fmt.Errorf("unexpected type in LeadingList: %T", t) | ||
| } | ||
| } | ||
| if needParen { | ||
| ctx.WritePlain(")") | ||
| } | ||
| return nil |
There was a problem hiding this comment.
Preserve the hint-level QB before a parenthesized first LEADING group.
When the first LEADING element is nested, Line 3994 pushes QBName into the child list, so Restore emits LEADING((@QB a, b), c) instead of LEADING(@QB (a, b), c). That breaks restore/parse round-tripping for valid hints with a hint-level query block.
Suggested fix
case *LeadingList:
- if err := t.RestoreWithQB(ctx, currentQBName, true, false, qbOnTable); err != nil {
+ childQBName := currentQBName
+ if i == 0 && childQBName.L != "" && !qbOnTable {
+ ctx.WriteKeyWord("@")
+ ctx.WriteName(childQBName.String())
+ ctx.WritePlain(" ")
+ childQBName = model.CIStr{}
+ }
+ if err := t.RestoreWithQB(ctx, childQBName, true, false, qbOnTable); err != nil {
return err
}
currentQBName = model.CIStr{}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/parser/ast/misc.go` around lines 3967 - 4005, The bug is that when the
first LeadingList item is itself a nested LeadingList, RestoreWithQB pushes the
hint-level QB into the child causing LEADING((`@qb` a, b), c) instead of
LEADING(`@qb` (a, b), c); modify LeadingList.RestoreWithQB so that in the case
"case *LeadingList" if i == 0 and currentQBName.L != "" and !qbOnTable you emit
the "@<qb>" (ctx.WriteKeyWord("@"); ctx.WriteName(currentQBName.String());
ctx.WritePlain(" ")) before restoring the child and then call t.RestoreWithQB
with an empty model.CIStr for qbName (so the child does not receive the QB),
afterwards clear currentQBName; otherwise keep the existing behavior of passing
currentQBName into the child.
| if blockOffset > 1 && blockOffset < len(queryBlockNames) { | ||
| blockName := queryBlockNames[blockOffset] | ||
| dbMatch := astTbl.DBName.L == "" || astTbl.DBName.L == blockName.DBName.L | ||
| tableMatch := astTbl.TableName.L == blockName.TableName.L | ||
| if dbMatch && tableMatch { |
There was a problem hiding this comment.
Preserve * DB wildcard in the subquery-alias fallback.
The fallback path drops the wildcard DB semantics that direct matching already supports. LEADING(*.alias) can match a derived-table alias in step 1, but the same binding stops matching in step 2 because dbMatch only accepts empty or exact DB names here.
Suggested fix
- dbMatch := astTbl.DBName.L == "" || astTbl.DBName.L == blockName.DBName.L
+ dbMatch := astTbl.DBName.L == "" || astTbl.DBName.L == blockName.DBName.L || astTbl.DBName.L == "*"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/planner/core/joinorder/util.go` around lines 221 - 225, The fallback path
in util.go currently computes dbMatch as "astTbl.DBName.L == '' ||
astTbl.DBName.L == blockName.DBName.L", which drops the "*" DB wildcard
semantics; update the dbMatch logic used in the blockOffset > 1 fallback so it
treats "*" as a wildcard (e.g., accept when astTbl.DBName.L == "*" or
blockName.DBName.L == "*") in addition to the empty or exact-match cases,
keeping tableMatch unchanged; modify the dbMatch check near variables
blockOffset, queryBlockNames, blockName, astTbl, dbMatch so LEADING(*.alias)
still matches derived-table aliases in this fallback path.
Codecov Report❌ Patch coverage is Please upload reports for the commit 0af91a8 to get more accurate results. Additional details and impacted files@@ Coverage Diff @@
## release-8.5 #68870 +/- ##
================================================
Coverage ? 42.2591%
================================================
Files ? 1536
Lines ? 429701
Branches ? 0
================================================
Hits ? 181588
Misses ? 231936
Partials ? 16177
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
[LGTM Timeline notifier]Timeline:
|
…_impl Signed-off-by: guo-shaoge <shaoge1994@163.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/rule_join_reorder.go`:
- Around line 449-474: Remove the dead duplicate function
checkAndGenerateLeadingHint from this file: the project already uses
joinorder.CheckAndGenerateLeadingHint(joinOrderHintInfo) (see usage at the other
call site), so delete the local checkAndGenerateLeadingHint definition to avoid
duplicate/unreferenced code; ensure no other local references to
checkAndGenerateLeadingHint remain and run tests/compile to confirm nothing else
relied on this duplicate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7a9ba0bb-3c35-4ac1-913e-5b9d4467fa69
📒 Files selected for processing (4)
pkg/parser/hintparser.ypkg/parser/parser_test.gopkg/planner/core/BUILD.bazelpkg/planner/core/rule_join_reorder.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/planner/core/BUILD.bazel
- pkg/parser/parser_test.go
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/planner/core/rule_join_reorder.go`:
- Around line 449-474: Remove the dead duplicate function
checkAndGenerateLeadingHint from this file: the project already uses
joinorder.CheckAndGenerateLeadingHint(joinOrderHintInfo) (see usage at the other
call site), so delete the local checkAndGenerateLeadingHint definition to avoid
duplicate/unreferenced code; ensure no other local references to
checkAndGenerateLeadingHint remain and run tests/compile to confirm nothing else
relied on this duplicate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7a9ba0bb-3c35-4ac1-913e-5b9d4467fa69
📒 Files selected for processing (4)
pkg/parser/hintparser.ypkg/parser/parser_test.gopkg/planner/core/BUILD.bazelpkg/planner/core/rule_join_reorder.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/planner/core/BUILD.bazel
- pkg/parser/parser_test.go
🛑 Comments failed to post (1)
pkg/planner/core/rule_join_reorder.go (1)
449-474:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDead code: local
checkAndGenerateLeadingHintduplicatesjoinorder.CheckAndGenerateLeadingHintand is never called.Line 342 uses
joinorder.CheckAndGenerateLeadingHint(joinOrderHintInfo), while this local function has an identical implementation and signature but is never invoked.🧹 Remove dead code
-// checkAndGenerateLeadingHint used to check and generate the valid leading hint. -// We are allowed to use at most one leading hint in a join group. When more than one, -// all leading hints in the current join group will be invalid. -// For example: select /*+ leading(t3) */ * from (select /*+ leading(t1) */ t2.b from t1 join t2 on t1.a=t2.a) t4 join t3 on t4.b=t3.b -// The Join Group {t1, t2, t3} contains two leading hints includes leading(t3) and leading(t1). -// Although they are in different query blocks, they are conflicting. -// In addition, the table alias 't4' cannot be recognized because of the join group. -func checkAndGenerateLeadingHint(hintInfo []*h.PlanHints) (*h.PlanHints, bool) { - leadingHintNum := len(hintInfo) - var leadingHintInfo *h.PlanHints - hasDiffLeadingHint := false - if leadingHintNum > 0 { - leadingHintInfo = hintInfo[0] - // One join group has one leading hint at most. Check whether there are different join order hints. - for i := 1; i < leadingHintNum; i++ { - if hintInfo[i] != hintInfo[i-1] { - hasDiffLeadingHint = true - break - } - } - if hasDiffLeadingHint { - leadingHintInfo = nil - } - } - return leadingHintInfo, hasDiffLeadingHint -}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/planner/core/rule_join_reorder.go` around lines 449 - 474, Remove the dead duplicate function checkAndGenerateLeadingHint from this file: the project already uses joinorder.CheckAndGenerateLeadingHint(joinOrderHintInfo) (see usage at the other call site), so delete the local checkAndGenerateLeadingHint definition to avoid duplicate/unreferenced code; ensure no other local references to checkAndGenerateLeadingHint remain and run tests/compile to confirm nothing else relied on this duplicate.
|
@guo-shaoge: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@guo-shaoge: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
…_impl Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: AilinKid The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What problem does this PR solve?
Issue Number: close #66088
Problem Summary: manually cherry pick #66087
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Improvements
Tests